Exploiting compiler-generated schedules for energy savings in high-performance processors
This paper develops a technique that uniquely combines the advantages of static scheduling and dynamic scheduling to reduce the energy consumed in modern superscalar processors with out-of-order issue logic. In this Hybrid-Scheduling paradigm, regions of the application containing large amounts of parallelism visible at compile-time completely bypass the dynamic scheduling logic and execute in a low-power static mode. Simulation studies using the Wattch framework on several media and scientific benchmarks demonstrate large improvements in overall energy consumption of 43% in kernels and 25% in full applications, with only a 2.8% performance degradation on average.
A hybrid-scheduling approach for energy-efficient superscalar processors
The management of power consumption while simultaneously delivering
acceptable levels of performance is becoming a critical task in high-performance,
general-purpose micro-architectures. Nearly a third of the energy
consumed in these processors can be attributed to the dynamic scheduling
hardware that identifies multiple instructions to issue in parallel. The energy
consumption of this complex logic structure is projected to grow dramatically
in future wide-issue processors.
This research develops a novel Hybrid-Scheduling approach that synergistically
combines the advantages of compile-time instruction scheduling and
dynamic scheduling to reduce energy consumption in the dynamic issue hardware.
This approach is predicated on the key observation that not all instructions
and basic blocks in a program are equal; some blocks are inherently
easy to schedule at compile-time, whereas others are not. In this scheme,
programs are thus partitioned into low power “static regions” and high power
“dynamic regions”. Static regions are regions of the program for which the
compiler can generate schedules comparable to the dynamic schedules created
by the run-time hardware. These regions bypass the dynamic issue units and
execute on specially designed low-power, low-complexity hardware.
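
The partitioning idea can be illustrated with a minimal sketch. It assumes that, for each basic block, both the compiler's schedule length and an estimate of the hardware's dynamic schedule length are available in cycles. The `Block` type, the `partition` function, the 5% tolerance, and the sample blocks are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    static_cycles: int   # length of the compiler-generated schedule (assumed known)
    dynamic_cycles: int  # estimated length of the run-time schedule (assumed known)


def partition(blocks, tolerance=0.05):
    """Split blocks into static and dynamic regions.

    A block counts as "static" when its compile-time schedule is within
    `tolerance` of the dynamic schedule length. The threshold is an
    illustrative assumption, not a value from the source.
    """
    static, dynamic = [], []
    for block in blocks:
        if block.static_cycles <= block.dynamic_cycles * (1 + tolerance):
            static.append(block.name)
        else:
            dynamic.append(block.name)
    return static, dynamic


# Hypothetical blocks: a regular loop kernel and an irregular pointer chase.
blocks = [Block("loop_kernel", 100, 98), Block("pointer_chase", 150, 100)]
static_regions, dynamic_regions = partition(blocks)
print("static:", static_regions)
print("dynamic:", dynamic_regions)
```

In this toy setting the regular kernel would be routed to the low-power path and the irregular block would stay on the dynamic issue logic; a real classifier would need a far richer cost model.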
An extensive evaluation of the proposed scheme reveals that the Hybrid-Scheduling
approach, wherein instructions are routed to a scheduling engine
tuned to a region’s characteristics, can provide a substantial reduction in processor
energy consumption while preserving high levels of performance.
Electrical and Computer Engineerin
Is Compiling for Performance == Compiling for Power?
Energy consumption and power dissipation are increasingly important design constraints in high-performance microprocessors. Compilers traditionally are not exposed to the energy details of the processor. However, as the power/energy problem grows, it is important to evaluate how existing compiler optimizations influence energy consumption and power dissipation in the processor. In this paper we present a quantitative study in which we examine the effect of the standard optimization levels -O1 to -O4 of DEC Alpha's cc compiler on the power and energy of the processor. We also evaluate the effect of four individual optimizations on power/energy and attempt to classify them as "low energy" or "low power" optimizations. In our experiments we find that optimizations that improve performance by reducing the number of instructions are optimized for energy. Such optimizations reduce the total amount of work done by the program. This is in contrast to optimizations that improve ..
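
The distinction between energy and power can be made concrete with a small worked example. It rests on a deliberately simple assumed model, not on the paper's measurements: every instruction costs the same fixed energy, and the instructions-per-cycle rate does not change. All function names and numbers below are hypothetical.

```python
def energy_joules(instructions, joules_per_instruction):
    """Total energy under the assumed fixed per-instruction cost."""
    return instructions * joules_per_instruction


def runtime_seconds(instructions, ipc, frequency_hz):
    """Execution time for a given instruction count, IPC, and clock rate."""
    return instructions / (ipc * frequency_hz)


def average_power_watts(instructions, joules_per_instruction, ipc, frequency_hz):
    """Average power is total energy divided by execution time."""
    return (energy_joules(instructions, joules_per_instruction)
            / runtime_seconds(instructions, ipc, frequency_hz))


# Hypothetical machine parameters (assumptions for illustration only).
JOULES_PER_INSTRUCTION = 1e-9
IPC = 2.0
FREQUENCY_HZ = 1e9

baseline = 1_000_000_000   # instructions before the optimization
optimized = 800_000_000    # instructions after removing 20% of the work

for label, count in (("baseline", baseline), ("optimized", optimized)):
    e = energy_joules(count, JOULES_PER_INSTRUCTION)
    p = average_power_watts(count, JOULES_PER_INSTRUCTION, IPC, FREQUENCY_HZ)
    print(f"{label}: energy = {e:.3f} J, average power = {p:.3f} W")
```

Under these assumptions, removing instructions lowers energy in proportion to the work removed while average power is unchanged, which is one way to read the "reduce the total amount of work" observation.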
Modulo-Variable Expansion Sensitive Scheduling
Modulo scheduling is an aggressive scheduling technique for loops that exploits instruction-level parallelism by overlapping successive iterations of the loop. Due to the nature of modulo scheduling, the lifetime of a variable can overlap with a subsequent definition of itself. To handle such overlapping lifetimes, modulo-variable expansion (MVE) is used, wherein the constructed schedule is unrolled a number of times. We propose a technique to improve the constructed schedule while performing MVE. In our approach, we unroll the data dependence graph of the original loop and re-schedule it with an MVE-sensitive scheduler. Such an approach is expected to result in better initiation rates as compared to the traditional approach. We have implemented our approach and evaluated its performance on a large number of scientific benchmark kernels.
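
The unrolling step of traditional MVE can be sketched as follows. The sketch assumes the commonly described rule that a value live for L cycles in a schedule with initiation interval II needs ceil(L / II) register copies, and that the kernel is unrolled by the largest such count. It does not reproduce the paper's MVE-sensitive scheduler; the function names and sample lifetimes are hypothetical.

```python
import math


def copies_needed(lifetime, ii):
    """Register copies for a value live for `lifetime` cycles when a new
    iteration starts every `ii` cycles (assumed rule: ceil(lifetime / ii))."""
    return max(1, math.ceil(lifetime / ii))


def mve_unroll_factor(lifetimes, ii):
    """Kernel unroll factor: the largest number of copies any value needs."""
    return max(copies_needed(lifetime, ii) for lifetime in lifetimes.values())


# Hypothetical variable lifetimes (in cycles) and initiation interval.
lifetimes = {"a": 2, "b": 5, "c": 3}
ii = 2

for name, lifetime in lifetimes.items():
    print(name, "needs", copies_needed(lifetime, ii), "copies")
print("unroll factor:", mve_unroll_factor(lifetimes, ii))
```

Here the value `b` outlives two subsequent definitions of itself, so it drives the unroll factor for the whole kernel.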
IISc Thesis Abstract, M.Sc. (Engng): Evaluation of register allocation and instruction scheduling methods in multiple issue
Exploiting greater instruction-level parallelism (ILP) through multiple instruction issue and execution has gained significant importance in modern processors for achieving higher performance. Compilation techniques can analyze the program to expose parallelism, transform the program to enhance the parallelism, and schedule the program to exploit parallelism. Thi